Database indexing using a tree structure

Datenbankindizierung unter Benutzung einer Baumstruktur

Indexage d'une base de donnees en utilisant une structure en arbre
EXTENDED DESIGNATED STATES: AL; LT; LV; MK; RO; SI
INTERNATIONAL PATENT CLASS: G06F-017/30 



ABSTRACT EP 1207464 A2 

A database of data items, such as images or video, is indexed by using
a tree structure. Each node (302, 304, 306, 308, 310, 312, 314, 316, 318,
320) of the tree structure relates to a region (202, 204, 206, 208,
206_1+206_2+206_3, 206_4_2+206_4_3+206_4_4, 206_4_1, 208_1,
208_2+208_3+208_4) in a feature vector space. At least one of the
terminal nodes (304, 312, 314, 316, 318, 320) of the tree indexes a
composite region (206_1+206_2+206_3, 206_4_2+206_4_3+206_4_4,
208_2+208_3+208_4) formed by combining a plurality of low population
regions from the same index level.
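
The tree construction described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes 2-D feature vectors in the unit square, splits each region into four quadrants, and uses hypothetical thresholds `max_pop` and `min_pop`. Sibling quadrants with fewer than `min_pop` vectors are merged into one composite terminal node.

```python
from collections import defaultdict

def quadrant(point, box):
    """Index (0-3) of the quadrant of `box` that contains `point`."""
    x0, y0, x1, y1 = box
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    return (1 if point[0] >= mx else 0) + (2 if point[1] >= my else 0)

def sub_box(box, q):
    """Bounding box of quadrant `q` of `box` (same bit layout as quadrant())."""
    x0, y0, x1, y1 = box
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    xs = (mx, x1) if q & 1 else (x0, mx)
    ys = (my, y1) if q & 2 else (y0, my)
    return (xs[0], ys[0], xs[1], ys[1])

def build(points, box=(0.0, 0.0, 1.0, 1.0), max_pop=4, min_pop=2,
          depth=0, max_depth=8):
    """Build the tree; thresholds are illustrative assumptions."""
    if len(points) <= max_pop or depth == max_depth:
        return {"regions": [box], "points": list(points)}
    groups = defaultdict(list)
    for p in points:
        groups[quadrant(p, box)].append(p)
    children, sparse_boxes, sparse_points = [], [], []
    for q in range(4):
        pts = groups.get(q, [])
        if len(pts) < min_pop:
            # Low-population region: defer it to the composite node.
            sparse_boxes.append(sub_box(box, q))
            sparse_points.extend(pts)
        else:
            children.append(build(pts, sub_box(box, q), max_pop, min_pop,
                                  depth + 1, max_depth))
    if sparse_boxes:
        # One terminal node indexes all sparse regions of this level.
        children.append({"regions": sparse_boxes, "points": sparse_points})
    return {"regions": [box], "children": children}

def leaves(node):
    """Yield every terminal node of the tree."""
    if "children" not in node:
        yield node
    else:
        for child in node["children"]:
            yield from leaves(child)
```

A composite node keeps a list of several regions, so a search that reaches it must test the query against each member region.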

ABSTRACT WORD COUNT: 93 

NOTE: 

Figure number on first page: NONE 

LEGAL STATUS (Type, Pub Date, Kind, Text) : 

Application: 020522 A2 Published application without search report 

LANGUAGE ( Publication, Procedural , Application ) : English; English; English 
FULLTEXT AVAILABILITY: 

Available Text Language Update Word Count 

CLAIMS A (English) 200221 1580 

SPEC A (English) 200221 2492 
Total word count - document A 4072 
Total word count - document B 0 
Total word count - documents A + B 4072 



...SPECIFICATION Furthermore, quicker retrievals can be provided by
repeatedly updating the predetermined threshold value e used in the
similarity measurement.

That is, a high-dimensional feature vector space indexed by the
method of indexing a feature vector space according to the present
invention can support functions such as similarity search, retrieval
or browsing in a scalable and efficient manner. Thus, even if the size of
a database increases, the time required for similarity search and
retrieval does not increase as much.

Furthermore, the method of indexing and searching a feature vector
space according to the present invention can be...
Adaptive search method in feature vector space
Angepasste Suchmethode im Eigenschaften-Vektorraum

Procede de recherche adaptive dans un espace de vecteurs de
caracteristiques
ABSTRACT EP 1205856 A2

An adaptive search method in feature vector space which can quickly 
search the feature vector space indexed based on approximation for a 
feature vector having features similar to a query vector according to a 
varying distance measurement is provided. The adaptive search method 
includes the steps of (a) performing a similarity measurement on a 
given query vector within the feature vector space, and (b) applying 
search conditions limited by the result of the similarity measurement
obtained in the step (a) and performing a changed similarity 
measurement on the given query vector. According to the adaptive search 
method, the number of candidate approximation regions is reduced during a 
varying distance measurement such as an on-line retrieval, which improves 
the search speed. 
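
Steps (a) and (b) can be sketched in Python under several assumptions that are not taken from the patent: vectors lie in the unit cube, the approximation is a uniform grid, and the varying distance is a weighted Euclidean distance whose weights change between the two steps. All function names are hypothetical. The vectors found in step (a) give an upper bound on the k-th nearest distance under the changed measure, and that bound is used to skip approximation cells in step (b).

```python
import heapq
from collections import defaultdict

def wdist(a, b, w):
    """Weighted Euclidean distance; the weights model the varying measure."""
    return sum(wi * (x - y) ** 2 for x, y, wi in zip(a, b, w)) ** 0.5

def build_cells(vectors, bins=4):
    """Approximate each vector in [0, 1)^d by the grid cell containing it."""
    cells = defaultdict(list)
    for v in vectors:
        cells[tuple(min(int(x * bins), bins - 1) for x in v)].append(v)
    return cells

def cell_lower_bound(q, cell, w, bins=4):
    """Smallest possible weighted distance from q to any point of the cell."""
    total = 0.0
    for x, c, wi in zip(q, cell, w):
        lo, hi = c / bins, (c + 1) / bins
        gap = lo - x if x < lo else x - hi if x > hi else 0.0
        total += wi * gap * gap
    return total ** 0.5

def search(q, cells, w, k, bound=float("inf"), bins=4):
    """k-NN search; cells whose lower bound exceeds `bound` are skipped."""
    best, visited = [], 0              # best is a max-heap of (-distance, vector)
    for cell, members in cells.items():
        if cell_lower_bound(q, cell, w, bins) > bound:
            continue
        visited += 1
        for v in members:
            d = wdist(q, v, w)
            if len(best) < k:
                heapq.heappush(best, (-d, v))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, v))
            if len(best) == k:
                bound = min(bound, -best[0][0])
    return sorted((-nd, v) for nd, v in best), visited

def adaptive_search(q, cells, w_first, w_changed, k):
    """Step (a) with w_first, then step (b) with w_changed and a tight bound."""
    first, _ = search(q, cells, w_first, k)
    # The k vectors from step (a) bound the k-th nearest distance in step (b).
    bound = max(wdist(q, v, w_changed) for _, v in first)
    return search(q, cells, w_changed, k, bound)
```

Because the second search starts with a finite bound instead of infinity, it never visits more cells than a fresh search under the changed measure would.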

ABSTRACT WORD COUNT: 121 

NOTE: 

Figure number on first page: NONE 

LEGAL STATUS (Type, Pub Date, Kind, Text): 

Application: 020515 A2 Published application without search report 

LANGUAGE ( Publication, Procedural , Application) : English; English; English 
FULLTEXT AVAILABILITY: 

Available Text Language Update Word Count 

CLAIMS A (English) 200220 536 

SPEC A (English) 200220 1889 
Total word count - document A 2425 
Total word count - document B 0 
Total word count - documents A + B 2425 



Image object ranking
ABSTRACT EP 1193648 A1

Automatic vision system object indexing and image database query system
using both path-dependent and path-independent features of moving objects
within a sequence of images. Feature vectors combine averages over the
frames in which an object traverses the field of view with averages over
the blocks of a grid for path association. Color histograms may be
included as a feature.

ABSTRACT WORD COUNT: 58 

NOTE: 

Figure number on first page: 1 



LEGAL STATUS (Type, Pub Date, Kind, Text) : 
Application: 020403 A1 Published application with search report

Examination: 021204 A1 Date of request for examination: 20021004

LANGUAGE ( Publication, Procedural , Application) : English; English; English 

FULLTEXT AVAILABILITY: 

Available Text Language Update Word Count 

CLAIMS A (English) 200214 283 

SPEC A (English) 200214 6925 
Total word count - document A 7208 
Total word count - document B 0 
...SPECIFICATION the 10% mark are listed in table 5.

Modifications

The preferred embodiments may be modified in various ways while
retaining the aspects of video object indexing with feature vectors
plus grid block sequences reflecting objects' paths of traversing the
field of view, and the query method of feature vector and grid block
sequence similarity searching metrics (including color histogram) for
finding objects of interest.

For example, the path-independent and the path-dependent features could 
be varied; the averaging of... 
ABSTRACT EP 1160690 A1

In the field of indexing multidimensional data, there has not been a
satisfactory data structure to support the nearest neighbor (NN) search
efficiently when the feature vectors are not uniformly distributed.

A method of indexing data elements in a feature vector space
comprises assigning data elements to first level index terms with a
first coarse granularity and identifying first level index terms for
which there are concentrations of elements and extending these first
level index terms to provide a finer grained index term for the elements
making up the concentrations of elements.

Additionally a method of searching for similarity in a feature
vector data space is provided.
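
The two-level indexing idea can be sketched as follows. This is only an illustration under assumed details: vectors lie in the unit cube, the coarse and fine granularities are given as bits per dimension (`coarse_bits`, `fine_bits`), and a cell counts as a concentration when it holds more than `dense` vectors. None of these names or values come from the patent.

```python
from collections import defaultdict

def cell_of(v, bits):
    """Quantize each coordinate of v (assumed in [0, 1)) to `bits` bits."""
    n = 1 << bits
    return tuple(min(int(x * n), n - 1) for x in v)

def build_index(vectors, coarse_bits=1, fine_bits=3, dense=4):
    """Coarse grid first; only crowded cells get a finer sub-index."""
    coarse = defaultdict(list)
    for v in vectors:
        coarse[cell_of(v, coarse_bits)].append(v)
    index = {}
    for cell, members in coarse.items():
        if len(members) > dense:
            # Concentration of vectors: extend the index term with finer cells.
            fine = defaultdict(list)
            for v in members:
                fine[cell_of(v, fine_bits)].append(v)
            index[cell] = {"sub": dict(fine)}
        else:
            index[cell] = {"items": members}
    return index
```

Sparse parts of the space keep short index terms, while the extra resolution is spent only where vectors are concentrated.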

ABSTRACT WORD COUNT: 109 

NOTE: 

Figure number on first page: 1 



LEGAL STATUS (Type, Pub Date, Kind, Text):

Application: 011205 A1 Published application with search report
Examination: 020424 A1 Date of request for examination: 20020211
Examination:
Change:
LANGUAGE (Publication, Procedural, Application): English; English; English
Total word count - document A
Total word count - document B
Total word count - documents A + B
Word Count
715
2292
search in a feature vector space




...SPECIFICATION distribution of feature vector data in a feature vector 
data space to cope with the concentration of feature vector data. 

A method of performing a similarity search on a feature vector 
data space which has been hierarchically indexed according to the 
indexing method of a feature vector space described with reference to 
Figure 1, will now be described. 

Feature vectors in each cell on which feature vectors are concentrated 
in the feature...

...CLAIMS in each cell, on which feature vectors are concentrated, using 
the vector approximation file and a corresponding sub-vector 
approximation file. 

15. A method of searching for similarity in a feature vector data 
space in which feature vectors are indexed, the method comprising
the step of (a) performing a similarity search in the feature 
vector data space which has been indexed by determining whether 
cells on which feature vectors are concentrated exist and 
hierarchically indexing feature vectors in the cells on which it 
is determined that feature vectors are concentrated according to a 
predetermined indexing method. 

16. The method of claim 15, wherein the step (a) is performed based on a 
nearest neighbor search. 
WOODWARD Robert, 1828 Rheem Court, Pleasanton, CA 94588, US, US

(Residence), US (Nationality), (Designated only for: US) 
QUERTERMOUS Thomas, 44 El Rey Road, Portola Valley, CA 94028, US, US 

(Residence), US (Nationality), (Designated only for: US) 
JOHNSON Frances, 44 El Rey Road, Portola Valley, CA 94028, US, US 
(Residence), US (Nationality), (Designated only for: US) 
Legal Representative: 
(EP) AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR
(OA) BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG 
(AP) GH GM KE LS MW MZ SD SL SZ TZ UG ZW 
(EA) AM AZ BY KG KZ MD RU TJ TM 
Main International Patent Class: C07H-021/04 
International Patent Class: C12Q-001/68 
Publication Language: English 
Filing Language: English 
Fulltext Availability: 
Detailed Description 
Claims 

Fulltext Word Count: 314482 



English Abstract 

Leukocyte gene expression profiling is utilized to identify 
oligonucleotides from gene expression candidate libraries. The expression 
libraries are generally immobilized on an array. Diagnostic 
oligonucleotide sets for analysis of leukocyte-related diseases are 
described. 



French Abstract 

L'invention concerne l'evaluation du niveau d'expression genique d'un
leucocyte utilise pour identifier des oligonucleotides a partir de
bibliotheques candidates d'expression genique. Ces bibliotheques
d'expression sont generalement immobilisees sur une matrice. L'invention
concerne egalement un oligonucleotide de diagnostic regle de facon a
analyser des maladies associees a un leucocyte.

Legal Status (Type, Date, Text)

Publication 20020725 A2 Without international search report and to be
republished upon receipt of that report.
Search Rpt 20020926 Late publication of international search report
Republication 20020926 A3 With international search report.
Republication 20020926 A3 Before the expiration of the time limit for
amending the claims and to be republished in the
event of the receipt of amendments.
Search Rpt 20020926 Late publication of international search report
Examination 20030213 Request for preliminary examination prior to end of
19th month from priority date
Correction 20030912 Corrected version of Pamphlet: pages 1/10-10/10,
drawings, replaced by new pages 1/11-11/11; due to
late transmittal by the receiving Office
Republication 20030912 A3 With international search report.



Fulltext Availability: 



Detailed Description 



Detailed Description 
...A SOLID SUBSTRATE FOR USE IN NUCLEIC ACID HYBRIDIZATION ASSAYS" to Bahl
et al., issued June 1, 1993; US Patent No. 5,707,807 "MOLECULAR INDEXING
FOR EXPRESSED GENE ANALYSIS" to Kato, issued January 13, 1998; US Patent
No. 5,807,522 "METHODS FOR FABRICATING MICROARRAYS OF BIOLOGICAL
SAMPLES" to Brown... multisequence file with the appropriate
labels for each clone in the headers for subsequent automated analysis.

Initially, known sequences were analyzed by pairwise similarity
searching using the blastn option of the blastall program obtained from
the National Center for Biological Information, National Library of
Medicine, National Institutes of Health (NCBI...

...were removed from the sequences. 

Messenger RNA contains repetitive elements that are found in genomic DNA. 

These repetitive elements lead to false positive results in similarity 
searches of query mRNA sequences versus known mRNA and EST databases. 
Additionally, regions of low information content (long runs of the same 
nucleotide, for example) also result in. . .design a probe for expression 
analysis and further approaches are taken to identify the gene or 
predicted gene that corresponds to the cDNA sequence, including 
similarity searches of other databases, molecular cloning, and Rapid 
Amplification of cDNA Ends (RACE) . 
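
The effect of low-information regions on similarity searches can be reduced by masking them before the search. The following Python sketch only masks long runs of a single nucleotide. The function name, the `min_run` threshold, and the masking character are assumptions for illustration and do not reproduce the filtering used in the patent.

```python
import re

def mask_homopolymers(seq, min_run=8, mask="N"):
    """Replace runs of at least `min_run` identical nucleotides with `mask`."""
    # ([ACGTU]) captures one base; \1{min_run-1,} requires it to repeat.
    pattern = re.compile(r"([ACGTU])\1{%d,}" % (min_run - 1), re.IGNORECASE)
    return pattern.sub(lambda m: mask * len(m.group(0)), seq)
```

Masked positions keep the sequence length unchanged, so coordinates reported by a later search still refer to the original sequence.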

In some cases, the process of analyzing many unknown sequences with 
BLASTN was... the peptide-predicting algorithms used to create the two 
sequences, but the homology between the two is significant. 

BLASTP and TBLASTN were also used to search for sequence similarities
in the SWISS-PROT, TrEMBL, GenBank Translated, and PDB databases. Matches to
several proteins were found, among them a tumor cell suppression protein,
HTS 1...

...used to conduct further domain and motif analysis. The Prosite search
generated many recognized protein domains. A BLASTP search was performed
to identify areas of similarity between the protein query sequence
and PRINTS, a protein database of protein fingerprints, groups of motifs
that together form a characteristic signature of a protein family. In
this case... PDB databases. No significant matches were found in any of
these, so no gene identity or tertiary structure was discovered.

The peptide sequence was also searched for similarity to known
domains and motifs using BLASTP with the Prosite, Blocks, Pfam, and
ProDom databases. The searches produced no significant alignments to
known domains. BLASTP...
Fulltext Availability:

Detailed Description 

Claims 

Fulltext Word Count: 9993 
English Abstract 

A database accessing system for processing a request to access a 
database including a multiplicity of entries, each entry including at 
least one word, the request including a sequence of representations of 
possibly erroneous user inputs (10), the system including a similar word 
finder operative (30), for at least one interpretation of each 
representation, to find at least one database word which is similar to 
that interpretation, and a database entry evaluator operative (50), for 
each database word found by the similar word finder, to assign similarity 
values for relevant entries in the database, the values representing the 
degree of similarity between each database entry and the request. 

French Abstract 

L'invention concerne un systeme d'acces a une base de donnees,
permettant de traiter une demande d'acces a une base de donnees
comprenant une multiplicite d'entrees, chaque entree comprenant au moins
un mot, la demande comprenant une sequence de representations de donnees
(10) d'utilisateur pouvant etre erronees. Le systeme comprend un
identificateur (30) de mots similaires, capable d'effectuer au moins une
interpretation de chaque representation, de maniere a trouver au moins un
mot de la base de donnees qui est similaire a cette interpretation, et un
evaluateur (50) des entrees a la base de donnees pour chaque mot de la
base de donnees trouve par l'identificateur de mots similaires,
permettant d'attribuer des valeurs de similarite a des entrees
pertinentes dans la base de donnees, les valeurs representant le degre de
similarite entre chaque entree de la base de donnees et la demande.

Fulltext Availability: 
Detailed Description 

Detailed Description 

complete original alphabet.

Each layer of the similarity search index contains the same dictionary
words but in a different reduced alphabet.

Each word in the similarity search index is represented in vector
format with a reduced alphabet, A lines.
Fig. 13 illustrates an example of a "grapheme based" similarity search
index for an English language dictionary. It...
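
A layered index of this kind can be sketched in Python. The two reduced alphabets in `LAYERS` are hypothetical groupings chosen for illustration (vowels merged, then similar consonants merged); the patent's actual alphabets and vector format are not given in this excerpt. Every layer stores the same words, keyed by their spelling in that layer's reduced alphabet.

```python
from collections import defaultdict

# Hypothetical reduced alphabets: each table maps several letters to one symbol.
LAYERS = [
    str.maketrans("aeiouy", "AAAAAA"),
    str.maketrans("aeiouybpdtgkcszmn", "AAAAAABBDDGGGSSNN"),
]

def build_index(words):
    """One dictionary per layer, keyed by the word in the reduced alphabet."""
    index = [defaultdict(set) for _ in LAYERS]
    for word in words:
        for layer, table in zip(index, LAYERS):
            layer[word.translate(table)].add(word)
    return index

def lookup(index, query):
    """Return candidates from the first (least reduced) layer that matches."""
    for layer, table in zip(index, LAYERS):
        hits = layer.get(query.translate(table))
        if hits:
            return sorted(hits)
    return []
```

Trying the least reduced layer first keeps the candidate list short for near-correct inputs and widens it only when nothing matches.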
ABSTRACT

PROBLEM TO BE SOLVED: To provide a method of quickly retrieving the
characteristic vector having the characteristic similar to the query
vector by measuring a distance variable in a vector space indexed on
the basis of the approximation.

SOLUTION: This adaptive retrieving method includes (a) a stage for
measuring the similarity in the characteristic vector space with respect
to the given query vector, and (b) a stage for measuring the changed
similarity with respect to the given query vector by applying a
retrieving condition limited by a result of the measurement of the
similarity obtained in the (a) stage. As the number of the candidate
approximate areas is small in measuring the variable distance such as
on-line retrieving, a retrieving speed can be improved.
ABSTRACT

PROBLEM TO BE SOLVED: To greatly accelerate the calculation of similarity
and to enable the application of an interactive style by obtaining a
reduced feature vector corresponding to each frame of a video and by
calculating a similarity score while using this reduced feature vector
and statistical models.



SOLUTION: A video feature 208 is selected out of the matrix of 
transformation coefficients as a transformation coefficient at a 
coefficient position inside a transformation matrix shown as a video set 
for video classification. A classifier 206 receives the respective video 
features 208 and inputs these video features 208 to respective image 
class statistical models 202-205. As a result, the respective frames of a 
video file 201 are classified into image classes expressed by the image 
class statistical models 202-205. The corresponding image class 
determined so as to correspond to the frame of the video file 201 by the 
classifier 206 is indexed to a video 207 with a labeled class. 
Media objects classification system e.g. for digital audio files
calculates query vector using index calculated after associating 
subsets of media objects into clusters 

Patent Assignee: COHEN M (COHE-I) 

Inventor: COHEN M 

Number of Countries: 001 Number of Patents: 001 
Patent No Kind Date Applicat No Kind Date Week
Abstract (Basic): US 20020184193 A1

NOVELTY - An electronic processor associates subsets of media
objects into one or more clusters of dissimilar objects for calculating
at least one index of one cluster. The similarity of the query
vector is calculated using the index.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are included for the 
following: 

(1) method for constructing an index structure for database ; 

and 

(2) method for searching a database for similar objects. 

USE - For classifying media objects, e.g. digital audio files and
electronic representations of audiovisual works.

ADVANTAGE - The number of similarity comparisons required to locate the
most similar vector in a combination is substantially reduced. Works well
with vectors of very high dimensionality, thus solving dimensionality
problems.

DESCRIPTION OF DRAWING(S) - The figure shows the graph explaining
the relationship between clusters and threshold.
pp; 6 DwgNo 1/1
Semantic content representing method for retrieving documents in computer
system, involves performing singular value decomposition and
dimensionality reduction of matrix to form latent semantic indexed
vector space
Number of Countries: 001 Number of Patents: 001
Abstract (Basic): US 20020103799 A1

NOVELTY - A two-dimensional matrix is formed with columns representing
documents, rows representing terms (including n-tuple terms) occurring
in the documents, and elements related to the number of occurrences
of the term in the document. A latent semantic indexed
vector space is formed by performing singular value decomposition and
dimensionality reduction of the matrix.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are included for the
following:

(1) Conceptual similarity determining method for query and 
reference documents in computer system; 

(2) Conceptual similarity determining method for subject and 
reference documents; and 

(3) Query document representing method. 

USE - For retrieving data stored in a database or in computer
files, and for collecting conceptually related material for proposal
preparation, research management, legal brief development, and document
declassification.

ADVANTAGE - The method searches a collection of one or more
documents efficiently based on conceptual content, even at sub-document
level. Two documents that are closely related in concept are
identified easily even when the documents in a database use different
synonyms.

DESCRIPTION OF DRAWING (S) - The figure illustrates the singular 
value decomposition operation of the latent semantic content 
representing method. 
Abstract (Basic): WO 200257961 A2

NOVELTY - Relevance of document to user's query is determined by 
calculating similarity coefficient based on structures of pair of 
query predicates and document predicates. Documents are autonomously 
clustered (140) using self-organizing neural network that provides 
coordinate system that makes judgments in non-subjective fashion. 

DETAILED DESCRIPTION - System determines relevance of document 
relative to user's query using comparison process. Input queries are 
parsed into query predicate structures using an ontological parser. The 
ontological parser parses a set of known documents to generate document 
predicate structures. A comparison of each query predicate structure 
with each document predicate structure is performed to determine a 
matching degree, represented by a real number. A multilevel modifier 
strategy is implemented to assign different relevance values to the 
different parts of each predicate structure match to calculate the 
predicate structure's matching degree. 

INDEPENDENT CLAIMS are also included for the following: 

(1) a clustering method using parsing and vectorizing, 

(2) a method of vectorizing a set of document predicate structures, 

(3) a relevancy ranking system, 

(4) a relevancy ranking system with a neural network, 

(5) a clustering system, 

(6) a question and answering system. 

USE - Relevancy ranking and clustering system for document queries, 
indexing and retrieval including on the Internet. 

ADVANTAGE - The system automates a document query process and 
enables the user to provide feedback in order to fine-tune the search 
process. The number vectors used for text representation are 
ontologically generated concept representations, with meaningful 
numerical relationships so closely related concepts have numerically 
similar representations while independent concepts have numerically 
different representations. Also, the concepts represented are in 
numerical form as part of complete predicate structures, rather than 
simple independent words. The vectorization method provides a way to 
represent both long and short queries with vector representations 
with same dimensions that permits faster clustering. The method also 
permits comparisons of large-scale patterns across the whole document 
rather than moving between small windows. 

DESCRIPTION OF DRAWING (S) - The block diagram represents a 
relevancy ranking system. 

Document clustering (140) 
Abstract (Basic): WO 200146858 A1

NOVELTY - Similar vectors are retrieved quickly from a vector
database of several hundred dimensions with reference to a single
vector index according to the criterion of the inner product or
distance, after specifying the similarity search range and the
maximum number of similar vectors to be retrieved. For creating the
vector index.

USE - Vector index creating and similar vector searching
method from vector database according to inner product criteria

DESCRIPTION OF DRAWING(S) - Vector database (101)

Sub-vector calculating means (102)

Norm distribution determining means (103) 

Norm section list (104) 

Area number calculating means (105) 

Argument distribution determining means (106) 

Argument section list (107) 

Norm section number calculating means (108) 

Argument section number calculating means (109) 
Index data calculating means (110) 
Index creating means (111) 
Vector index (112) 
Method of indexing database of stored object - involves conducting
similarity search on indexed set of truncated transformed feature
vectors to retrieve set of vectors which represent super-set of
objects including desired objects and false positives
Abstract (Basic): US 5647058 A

The method involves applying a set of feature extraction functions
to extract a set of feature vectors from the stored objects in the
database. The set of feature extraction functions has a similarity
measure applicable to the stored objects. The set of extracted feature
vectors are transformed using an orthonormal transform such that the
similarity measure is preserved.

The transformed feature vectors are truncated such that the
entries which contribute little to the information of the transformed
vectors are removed. The truncated feature vectors are indexed
using a non-sequential point-access-method (PAM). A similarity
search on the indexed set of truncated transformed feature vectors
is conducted to retrieve a set of vectors which represent a superset
of objects including desired objects and false positives. A secondary
search is performed on the retrieved set of vectors to eliminate the
false positives.
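
The filter-and-refine pipeline in this abstract can be sketched in Python. The sketch makes several assumptions: the orthonormal transform is a Haar transform on vectors whose length is a power of two, the similarity measure is Euclidean distance, and a linear scan stands in for the point access method. An orthonormal transform preserves Euclidean distance, so dropping coefficients can only shrink the distance; the truncated search therefore returns a superset of the true answers.

```python
import math

def haar(v):
    """Orthonormal Haar transform; len(v) must be a power of two."""
    out = list(v)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        dif = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = avg + dif        # coarse coefficients end up first
        n = half
    return out

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_query(objects, query, eps, keep=2):
    """Return (candidate indices, answer indices) for a range query."""
    # Truncated transformed vectors; a linear scan stands in for the PAM.
    reduced = [haar(o)[:keep] for o in objects]
    q = haar(query)[:keep]
    # Filter step: a small tolerance guards against rounding at the boundary.
    candidates = [i for i, r in enumerate(reduced) if dist(r, q) <= eps + 1e-9]
    # Refine step: the full distance removes the false positives.
    answers = [i for i in candidates if dist(objects[i], query) <= eps]
    return candidates, answers
```

In the test data below, the object `(0, 2, 2, 0)` survives the filter step as a false positive and is removed by the secondary search.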

ADVANTAGE - Achieves efficient retrieval from a database of
high-dimensionality points while guaranteeing completeness, and
reduces the propensity for false positives.
English Abstract

Similar vectors are retrieved quickly from a vector database of several
hundred dimensions with reference to a single vector index
according to the criterion of the inner product or distance, after
specifying the similarity search range and the maximum number of similar
vectors to be retrieved. For creating the vector index, each vector
is decomposed into sub-vectors and featured by a norm section, an
assigned area, and an argument section. For similarity search, a
sub-query vector and a sub search range are determined from the query
vector and the search range, similarity search in sub-space is carried
out, and differences from the search range are cumulated to determine the
upper limits. An accurate criterion having a higher upper limit is
preferentially determined, thereby producing a final similarity search
result.
Title: A vector based approach to color image retrieval
Author: Androutsos, D.; Plataniotis, K.N.; Venetsanopoulos, A.N.
Corporate Source: Digit. Sign./Image Processing Lab, Department of
Electrical Engineering, University of Toronto, Toronto, Ont., M5S 3G4,
Canada
Source: Proceedings of SPIE - The International Society for Optical
Engineering v 3527 1998. p 497-504 
Publication Year: 1998 
CODEN: PSISDG ISSN: 0277-786X 
Language: English 

Document Type: CA; (Conference Article) Treatment: T; (Theoretical) 
Journal Announcement: 0305W4 

Abstract: In this paper we present a novel technique for image 
retrieval based on color. Our system is based on color segmentation where 
only a small number of representative color vectors are extracted from 
each image and used to build image indices. These vectors are then
used with vector distance measures to determine similarity between a
query color and a database image. We test numerous popular vector
distance measures in our system, along with popular histogram techniques, 
and find that angular directional measures using our technique provide 
more accurate and perceptually relevant retrievals. 10 Refs. 
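
An angular measure between color vectors can be sketched in Python. The paper's exact measure and index layout are not given in this abstract, so the sketch assumes RGB triples, uses the plain angle between vectors, and scores an image by its closest representative color. The image names and color values are made up for illustration.

```python
import math

def angle(c1, c2):
    """Angle in radians between two non-zero color vectors (e.g. RGB)."""
    dot = sum(a * b for a, b in zip(c1, c2))
    norm = math.sqrt(sum(a * a for a in c1)) * math.sqrt(sum(b * b for b in c2))
    # Clamp to [-1, 1] so rounding cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def rank_images(query_color, index):
    """Order image names by the smallest angle to any representative color."""
    scored = [(min(angle(query_color, c) for c in colors), name)
              for name, colors in index.items()]
    return [name for _, name in sorted(scored)]
```

The angle depends only on direction, so a darker or brighter version of the same hue scores as identical; a measure that also uses magnitude would separate them.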

Descriptors: Image retrieval; Color image processing; Image 
Journal Announcement: 0305W4

Abstract: Many data-intensive applications, such as content-based 
retrieval of images or video from multimedia databases and similarity 
retrieval of patterns in data mining, require the ability of efficiently 
performing similarity queries. Unfortunately, the performance of
nearest neighbor (NN) algorithms, the basis for similarity search,
quickly deteriorates with the number of dimensions. In this paper we
propose a method called Clustering with Singular Value Decomposition
(CSVD), combining clustering and singular value decomposition (SVD) to
reduce the number of index dimensions. With CSVD, points are grouped into
clusters that are more amenable to dimensionality reduction than the
original dataset. Experiments with texture vectors extracted from
satellite images show that CSVD achieves significantly higher 
dimensionality reduction than SVD alone for the same fraction of total 
variance preserved. Conversely, for the same compression ratio CSVD 
results in an increase in preserved total variance with respect to SVD 
(e.g., a 70% increase for a 20:1 compression ratio). Then, approximate NN 
queries are more efficiently processed, as quantified through experimental 
results. 28 Refs. 
Source: Information Processing Letters v 80 n 2 Oct 31 2001. p 87-95

Publication Year: 2001 

CODEN: IFPLAT ISSN: 0020-0190

Language: English 

Document Type: JA; (Journal Article) Treatment: T; (Theoretical) 
Journal Announcement: 0110W4 

Abstract: In this paper, we investigate the problem of clustering
multidimensional data sequences such as video streams. Each sequence is
represented by a small number of hyper-rectangular clusters for subsequent
indexing and similarity search processing. We present a linear
clustering algorithm that guarantees the predefined level of clustering
quality, and show its effectiveness via experiments on various video data
sets. (c) 2001 Elsevier Science B.V. All rights reserved. 9 Refs.
Journal Announcement: 0107W1

Abstract: This paper proposes a novel method for similarity search
that supports time warping in large sequence databases. Time warping
enables finding sequences with similar patterns even when they are of
different lengths. Previous methods for processing similarity search
that supports time warping fail to employ multi-dimensional indexes
without false dismissal since the time warping distance does not satisfy
the triangular inequality. Our primary goal is to improve search
performance without permitting any false dismissal. To attain this goal,
we devise a new distance function D_tw-lb that consistently
underestimates the time warping distance and also satisfies the triangular
inequality. D_tw-lb uses a 4-tuple feature vector that is
extracted from each sequence and is invariant to time warping. For
efficient processing of similarity search, we employ a
multi-dimensional index that uses the 4-tuple feature vector as indexing
attributes and D_tw-lb as a distance function. The extensive
experimental results reveal that our method achieves a significant speedup
of up to 43 times with real-world S&P 500 stock data and up to 720 times with
very large synthetic data. 22 Refs.
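
Note: the abstract above does not say which four features make up the 4-tuple. The sketch below assumes (first, last, greatest, smallest) element, which are unchanged when elements are repeated by warping, and takes the lower bound as the largest feature difference. The names `dtw`, `features` and `lower_bound` are hypothetical; this illustrates the lower-bounding idea, not the paper's exact definition.

```python
def dtw(a, b):
    """Time warping distance with |x - y| as the per-element cost (sum form)."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def features(s):
    # Assumed 4-tuple: first, last, greatest and smallest element.
    return (s[0], s[-1], max(s), min(s))

def lower_bound(a, b):
    # L-infinity distance between feature tuples. As a metric on 4-D points
    # it satisfies the triangle inequality, so a multi-dimensional index
    # can use it directly.
    return max(abs(x - y) for x, y in zip(features(a), features(b)))
```

Any warping path must align the two first elements, the two last elements, and the extreme element of one sequence with some element of the other, so each feature difference appears as a lower bound on at least one term of the warping cost. Filtering on `lower_bound(q, s) <= eps` therefore never dismisses a sequence with `dtw(q, s) <= eps`.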
Publication Year: 1999
Language: English

Document Type: JA; (Journal Article) Treatment: T; (Theoretical) 
Journal Announcement: 9908W1 

Abstract: We present a new multi-dimensional access method for querying 
by similarity in databases of high-dimensional vectors. The query 
vector projection access method (QVPAM) addresses the shortcomings of other 
dimensionality reduction techniques by deriving the best transformation of 
the vectors at query-time. QVPAM creates a projection library that contains 
building blocks for constructing the transformations. QVPAM rapidly 
searches the projection library at query-time in order to select the set of 
projection elements that minimizes the work for processing the query. Since 
the selected set does not need to be complete, QVPAM effectively trades off
query precision and query response time. We describe QVPAM and demonstrate 
its performance in the content-based querying of a database of 
high-dimensional color histograms. (Author abstract) 15 Refs. 
Journal Announcement: 9906W5

Abstract: Due to the growth of large data collections, information
retrieval or database searching is of vital importance. Lexical matching
techniques may retrieve irrelevant or inaccurate results because of
synonyms and polysemous words, so effective concept-based techniques are
needed. One such technique is latent semantic indexing (LSI), which uses a
vector-space approach by identifying documents whose content is related
to the user's query in order of similarity. LSI uses the singular value
decomposition (SVD) of a term-by-document matrix to encode the terms and
documents in a vector-space model. Existing methods for removing terms or
documents from the term-document space are either time consuming or do not
sufficiently change the term-document relationships. This paper presents a
new method for downdating, the downdating the reduced model (DRM) method,
and discusses its implementation in the LSI++ software
environment. The DRM method can be used to assess the effect that a term or
document has on the clustering of relevant information in a collection and
for the incorporation of user feedback in the existing LSI model.
Implementing the DRM method within LSI++ not only provides
downdating functionality, but is less time consuming than recomputing the
SVD when removing a term, document or both. The DRM method is a viable
algorithm for dynamic information modeling and retrieval. (Author abstract)
20 Refs.
Abstract: Efficient indexing of high dimensional feature vectors is
important to allow visual information systems and a number of other
applications to scale up to large databases. In this paper, we define
this problem as 'similarity indexing' and describe the fundamental types of
'similarity queries' that we believe should be supported. We also
propose a new dynamic structure for similarity indexing called the
similarity search tree or SS-tree. In nearly every test we performed on
high dimensional data, we found that this structure performed better than
the R*-tree. Our tests also show that the SS-tree is much better suited for
approximate queries than the R*-tree. (Author abstract) 28 Refs.

Descriptors: *Indexing (of information); Query languages; Trees
(mathematics); Information retrieval systems; Data structures 

Identifiers: Similarity indexing; Similarity search tree 
Author: Davcev, Danco; Cakmakov, Dusan; Arnautovska, Vesna

Corporate Source: 'Kiril & Metodij' Univ, Skopje, Macedonia

Conference Title: Proceedings of the 27th Hawaii International Conference
on System Sciences (HICSS-27). Part 3 (of 5)

Conference Location: Wailea, HI, USA Conference Date: 19940104-19940107 
Sponsor: University of Hawaii; University of Hawaii College of Business
Administration; IEEE Computer Society; Association for Computing Machinery
E.I. Conference No.: 20790 

Source: Proceedings of the Hawaii International Conference on System 
Sciences v 3 1994. Publ by IEEE, Computer Society Press, Los Alamitos, CA, 
USA, 94TH0607-2. p 581-589 

Publication Year: 1994 

CODEN: PHISD7 ISSN: 1060-3425 ISBN: 0-8186-5070-2
Language: English 

Document Type: CA; (Conference Article) Treatment: G; (General Review); 
T; (Theoretical) 

Journal Announcement: 9410W4 

Abstract: The basic elements of A Multimedia Cognitive-based Information
Retrieval System called AMCIRS, which integrates image and text
information, have been described elsewhere. The AMCIRS query mechanism
is based on multimedia object content search using the vector model. The
content search process is reduced to similarity estimation between
query and index vectors. The main objective of this paper is to
introduce the similarity estimation model for geometrical objects as a part
of the query mechanism of AMCIRS. Our model for polygon similarity estimation
introduces a numerical measure of similarity between two polygons and gives
acceptable results for all polygon forms and any number of vertices. The
algorithm based on this model as well as the simulation results are also
given. (Author abstract) 27 Refs.

Descriptors: *Information retrieval systems; Query languages; Information
retrieval; Information services; Mathematical models; Geometry; Vectors; 
Algorithms; Information management; Fourier transforms 

Identifiers: Query based mechanisms; Geometrical objects; A multimedia 
Document Type: CA; (Conference Article) Treatment: A; (Applications); G;
(General Review); T; (Theoretical)
Journal Announcement: 9406W3

Abstract: A Multimedia Cognitive-based Information Retrieval System
called AMCIRS, which integrates image and text information, has been
described in [11], [12]. The AMCIRS query mechanism is based on
multimedia object content search using the vector model. The content
search process is reduced to similarity estimation between query and
index vectors. The main
objective of this paper is to present an application of AMCIRS in
Mineralogy. The experimental evaluation of AMCIRS retrieval effectiveness
is also given. The retrieval effectiveness is expressed by recall and
precision parameters, which are the standard measures of the effectiveness of
information retrieval systems. We confirmed our assumption that
multiple media retrieval has advantages with respect to single media
retrieval. (Author abstract) 26 Refs.
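
Note: recall and precision, named above as the evaluation measures, can be computed as in this minimal sketch. The function name and the use of document identifiers are illustrative assumptions; nothing here is taken from AMCIRS itself.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one query.

    recall    = relevant items retrieved / all relevant items
    precision = relevant items retrieved / all retrieved items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision
```

For example, retrieving four items of which two are among three relevant ones gives a recall of 2/3 and a precision of 1/2.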
; Query vectors; Similarity estimation; Retrieval effectiveness
Abstract: In this paper, we present experiments in the retrieval of
multimedia mineral information using AMCIRS (A Multimedia Cognitive-based
Information Retrieval System). The AMCIRS query mechanism is based on
multimedia object content search using the vector model. Each vector is
composed of text and image objects. The image objects in the vectors
are image object contours, represented by polygonal approximations. The
content search process is reduced to similarity estimation between
the MM query and MM index vectors. The similarity function for
image objects is based on polygon similarity estimation. The
experimental evaluation of AMCIRS retrieval effectiveness is expressed by
the recall and precision parameters. Possible advantages of multiple media
retrieval with respect to single medium retrieval are also investigated
and explicitly represented by recall-precision diagrams. (Author
abstract) 27 Refs.

Descriptors: *Information retrieval systems; Cognitive systems; Minerals;
Online searching; Query languages; Mathematical models; Information 
analysis; Knowledge based systems; Parameter estimation; Approximation 
theory 

Identifiers: Multimedia cognitive based information retrieval system; 
Polygon similarity estimation; Content search; Information recall;
Precision 

Classification Codes: 

903.3 (Information Retrieval & Use); 723.5 (Computer Applications); 
482.2 (Minerals); 723.3 (Database Systems); 903.2 (Information 
Dissemination) 

903 (Information Science); 723 (Computer Software); 482 (Mineralogy & 
Petrology); 922 (Statistical Methods) 

90 (GENERAL ENGINEERING); 72 (COMPUTERS & DATA PROCESSING); 48 
(ENGINEERING GEOLOGY); 92 (ENGINEERING MATHEMATICS) 



10/5/11 (Item 1 from file: 35) 

DIALOG (R) File 35 : Dissertation Abs Online 

(c) 2003 ProQuest Info&Learning. All rts. reserv.

01771492 ORDER NO: AADAA-IC801556 

Image texture analysis with fast similarity search for content-based 
retrieval and navigation 

Author: Kuan, Joseph 
Degree: Ph.D. 
Year: 1998 

Corporate Source/Institution: University of Southampton (United Kingdom) 
(5036) 
One of the main challenges of multimedia and hypermedia research is
the effective use of the media content for retrieval and navigation in
multimedia environments. This thesis is concerned with the use of texture
as one of the keys for content based retrieval (CBR) and content based
navigation (CBN). Other authors have proposed texture analysis procedures
and an initial aim was to identify a versatile texture representation which
is effective over a very wide range of textures and which could be used
efficiently in the context of CBR and CBN. In order to index the
multidimensional feature vectors representing texture efficiently, this
thesis has also focused on issues of multidimensional indexing for fast
similarity search.

This thesis proposes a novel texture representation method which uses 
the edge and plain region information from texture patterns. The 
information is used to evaluate contrast across edges, the mean greylevel 
of plain regions and the conditional probability matrix of edge directions 
and plain regions as features. A weighted Euclidean measurement for this 
method is proposed which gives better matching than the standard Euclidean 
measure. The new representation is compared with a range of previous 
texture representation schemes using a wide range of texture patterns and 
its classification properties and speed performance are shown to be an 
improvement on the other schemes. Since texture is typically represented by 
a multidimensional feature vector, this thesis investigates
multidimensional indexing strategies and k-nn retrieval
methods and proposes new and more efficient approaches in the context of 
multimedia information handling. 

Two different multidimensional indexing approaches are explored in 
this thesis; the R*-tree and the Hilbert R-tree. Data object retrieval and 
range search performance are compared in various aspects, including the 
number of dimensions, nature of databases and database size. 
k nearest neighbours (k-nn) similarity
search is significant for image based CBR and CBN in multimedia systems,
and a new algorithm for k-nn search is proposed which is
an improvement over previous approaches for image based CBR and CBN
applications.

The novel texture representation technique, fast indexing R-tree and 
the enhanced R-tree similarity search technique are integrated into an 
open hypermedia system which offers content based retrieval and navigation 
for multimedia data. The thesis concludes with examples of the use of the 
system for texture based image retrieval and also texture based 
navigation from images to other media types. 
We propose a feature-based indexing method for spatial data objects.
The aim is to efficiently retrieve the data objects that match or are
similar to a given query object in a spatial database environment. Our
method extracts some features from each data object in order to build an
index tree. A broad range of problems and issues such as indexing and
modeling, similarity matching, transformations, features and spatial access
methods must be dealt with in any feature-based indexing method. Our work
consists of two parts. First, we propose a framework for feature-based
indexing of image data and apply our method to the damage zone shapes of
materials. A set of generic features which are invariant to rotation,
translation and scaling, for the sake of similarity matching, is proposed.
These features form a feature vector for each image. The feature vectors
are extended with some domain specific features. The feature vectors are
used to build the index structure. Any multi-dimensional point access
method can then be used to build the index. However, we use a variant of the
K-D-B tree. Weighted Euclidean distance is used as the similarity measure. Each
feature in the feature vector is associated with a weight, based on the
application, which is used in the search process for similarity
matching. A formula is proposed to find the similarity of nodes in the
index tree with a given query shape. This formula is used to prune the
search tree in the query processing.
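
Note: the abstract does not give its node-similarity formula. The sketch below shows a weighted Euclidean distance and, as a stand-in for that formula, the standard minimum distance from a query point to an axis-aligned bounding region, which is one common way to prune tree nodes. Both function names and the bounding-box representation are assumptions for illustration.

```python
import math

def weighted_euclidean(x, y, w):
    """Weighted Euclidean distance; w[i] >= 0 is the weight of feature i."""
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))

def min_dist_to_box(q, lo, hi, w):
    """Smallest weighted distance from q to any point in the box [lo, hi].

    A node can be skipped when this value exceeds the search radius,
    because no feature vector stored under the node can be closer.
    """
    total = 0.0
    for qi, l, h, wi in zip(q, lo, hi, w):
        gap = max(l - qi, 0.0, qi - h)  # 0 when qi lies inside [l, h]
        total += wi * gap * gap
    return math.sqrt(total)
```

Raising a feature's weight makes differences in that feature count more, so the same index can serve applications that value features differently, provided the same weights are used for both the node bound and the final distance.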

In the second part, we propose two inter-sequence matching methods for 
exact and similarity matching of image sequences. We assume that the 
image sequence matching problem is transformed into matching sequences of 
real numbers. The methods do not require sequences to have the same length. 
The first method tries to find the Longest Matching Subsequences (LMS) of 
two sequences to be matched and uses a modified version of the Longest 
Common Subsequence (LCS) method for actual matching. In the second method, 
a modified version of restricted edit distance is used for matching. We 
also propose a feature-based indexing mechanism to filter out those 
sequences which are matching candidates with a given query sequence from a 
large data set. Like all other feature-based indexing methods, our method 
maps each sequence into a point in K dimensional space, where K is the 
number of extracted features for the sequence. It operates in two phases, 
hypothesizing and verification. Lengths and moments (mean and variance) of 
sequences are used as features. Experimental results indicate that the 
features and proposed method for query processing perform well as a filter. 
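
Note: the thesis's modified LCS method is not specified in this abstract. The following sketch assumes one plausible modification for real-valued sequences, treating two elements as matching when they differ by at most a tolerance `eps`, and shows the (length, mean, variance) feature tuple named above. The function names and the `eps` parameter are hypothetical.

```python
def lcs_length(a, b, eps=0.0):
    """Length of the longest common subsequence of two real-valued
    sequences, where elements match if they differ by at most eps."""
    n, m = len(a), len(b)
    t = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                t[i][j] = t[i - 1][j - 1] + 1
            else:
                t[i][j] = max(t[i - 1][j], t[i][j - 1])
    return t[n][m]

def sequence_features(s):
    """Map a sequence to a point: (length, mean, variance)."""
    n = len(s)
    mean = sum(s) / n
    var = sum((x - mean) ** 2 for x in s) / n
    return (n, mean, var)
```

In a two-phase scheme like the one described, the cheap feature tuple would select candidates (hypothesizing) and the quadratic-time LCS comparison would run only on those candidates (verification).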
Author(s): Ordonez, J.R.; Cazuguel, G.; Puentes, J.; Solaiman, B.;
Cauvin, J.M.; Roux, C.

Author Affiliation: INSERM, Brest, France 

Conference Title: 2001 Conference Proceedings of the 23rd Annual 
International Conference of the IEEE Engineering in Medicine and Biology 
Society (Cat. No. 01CH37272) Part vol.3 p. 2465-8 vol.3

Publisher: IEEE, Piscataway, NJ, USA 

Publication Date: 2001 Country of Publication: USA 4 vol. 4132 pp. 
ISBN: 0 7803 7211 5 Material Identity Number: XX-2002-02147 

U.S. Copyright Clearance Center Code: 0-7803-7211-5/01/$17.00
Conference Title: 2001 Conference Proceedings of the 23rd Annual
International Conference of the IEEE Engineering in Medicine and Biology
Society

Conference Date: 25-28 Oct. 2001 Conference Location: Istanbul, Turkey 
Medium: Also available on CD-ROM in PDF format 
Language: English Document Type: Conference Paper (PA) 
Treatment: Practical (P) ; Theoretical (T) ; Experimental (X) 
Abstract: Addresses the problem of efficient image retrieval from a
compressed image database, using information derived from the
compression process. Images in the database are compressed by applying two
approaches: vector quantization (VQ) and quadtree image decomposition.
Both are based on Kohonen's self-organizing feature maps (SOFM) for
creating vector quantization codebooks. However, while VQ uses one codebook
of one resolution to compress the images, quadtree decomposition
simultaneously uses four codebooks of four different resolutions. Image
indexing is implemented by generating a feature vector (FV) for each
compressed image. Accordingly, images are retrieved by evaluating FV
similarity between the query image and the images in the database,
depending on a distance measure. Three distance measures have been analyzed
to assess FV index similarity: Euclidean, intersection and correlation
distances. The retrieval efficiency of the distance measures is evaluated for
different VQ resolutions and different quadtree image descriptors.
Experimental results using real data, esophageal ultrasound and eye
angiography images, are presented. (8 Refs)
Subfile: B C 
Abstract: This paper discusses an index-based subsequence matching that
supports time warping in large sequence databases. Time warping enables
finding sequences with similar patterns even when they are of different
lengths. In our earlier work, we suggested an efficient method for whole
matching under time warping. This method constructs a multidimensional
index on a set of feature vectors, which are invariant to time warping,
from data sequences. For filtering at feature space, it also applies a
lower-bound function, which consistently underestimates the time warping
distance as well as satisfies the triangular inequality. We incorporate the
prefix-querying approach based on sliding windows into the earlier
approach. For indexing, we extract a feature vector from every
subsequence inside a sliding window and construct a multidimensional index
using a feature vector as indexing attributes. For query processing,
we perform a series of index searches using the feature vectors of
qualifying query prefixes. Our approach provides effective and scalable
subsequence matching even with a large volume of a database. We also
prove that our approach does not incur false dismissal. To verify the
superiority of our method, we perform extensive experiments. The results
reveal that our method achieves significant speedup with real-world S&P 500
stock data and with very large synthetic data. (21 Refs)
Treatment: Practical (P)

Abstract: We propose multi-precision similarity matching where the image
is divided into a number of subblocks, each with its associated color
histogram. We present experimental results showing that the spatial
distribution information recorded by multi-precision color histograms helps
to make similarity matching more precise. We also show that sub-image
queries are much better supported with multi-precision color histograms. To
minimize the overhead, we employ a filtering scheme based on the
3-dimensional average color vectors. We provide a formal result proving
that filtering with multi-precision color histograms is complete. Finally,
we develop a novel extendible hashing structure for indexing the average
color vectors. We give experimental results showing that the proposed
structure significantly outperforms the SR-tree. (21 Refs)
Subfile: C 
Abstract: Similarity queries on complex objects are usually
translated into searches among their feature vectors. The paper studies
indexing techniques for very high-dimensional (e.g., in hundreds) vectors
that are sparse or quasi-sparse, i.e., vectors each having only a small
number of non-zero or significant values. Based on the R-tree, the paper
introduces the xS-tree that uses lossy compression of bounding regions to
guarantee a reasonable minimum fan-out within the allocated storage space
for each node. In addition, the paper studies the performance and
scalability of the xS-tree via experiments. (29 Refs)
Subfile: C 
Abstract: This paper describes a computationally efficient method for
fast retrieval of color images from multimedia and imaging databases.
Although the proposed algorithm can operate in an n-dimensional feature
space for search, in our experiments we use only one 3D vector as key for
indexing and searching color pictures of the selected archives. A new
feature extraction and matching technique is developed based on the
first-order statistics of color image data. Eigenvalue analysis provides
an effective way of reducing 3-D color data to a one-dimensional (1-D)
array. This feature extraction and reduction step is performed only once
when an (R,G,B) color picture is submitted for storage or query. For a
similarity measure, the Tanimoto coefficient is selected as a
computationally efficient matching algorithm to evaluate the search
results. It is shown that the idea of projection-based retrieval is similar
to the well-known histogram intersection operation of Swain and Ballard
(1991). The algorithm described has been tested on eleven different
databases, each of which consists of various color images of different
scenes stored in a content addressable stack. The efficacy of retrieval was
determined using the percentage efficiency measure eta = p/P, where p is the
number of similar pictures retrieved in a short list and P is the total
number of similar pictures in an archive. The experimental results yield
almost 90% average retrieval efficiency for the eleven databases searched
with the 3-D index or key vector. (16 Refs)
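
Note: a minimal sketch of the two quantities named in the abstract above, the Tanimoto coefficient between two real-valued key vectors and the efficiency measure eta = p/P. The function names are illustrative, and returning 0.0 for two all-zero vectors is an assumption, not something the abstract specifies.

```python
def tanimoto(a, b):
    """Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b).

    Equals 1.0 for identical non-zero vectors and 0.0 for orthogonal ones.
    """
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    if denom == 0:
        return 0.0  # assumption: both vectors are all zeros
    return dot / denom

def retrieval_efficiency(p, P):
    """eta = p / P: similar pictures retrieved over similar pictures archived."""
    return p / P
```

With 3-D key vectors the coefficient costs only a handful of multiplications per comparison, which is consistent with the abstract's emphasis on computational efficiency.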
Author Affiliation: Dept. of Inf. & Comput. Sci., California Univ.,
Irvine, CA, USA 

Conference Title: Proceedings. Eleventh International Conference on 
Scientific and Statistical Database Management p. 56-67 
Publisher: IEEE Comput. Soc, Los Alamitos, CA, USA 

Publication Date: 1999 Country of Publication: USA xiii+287 pp. 
ISBN: 0 7695 0046 3 Material Identity Number: XX-1999-02058 

U.S. Copyright Clearance Center Code: 0 7695 0046 3/99/$10.00
Conference Title: Proceedings of Eleventh International Conference on
Scientific and Statistical Database Management '99
Conference Sponsor: Case Western Univ.; ACM SIGMOD; VLDB Endowment
Conference Date: 28-30 July 1999 Conference Location: Cleveland, OH,
USA

Language: English Document Type: Conference Paper (PA) 
Treatment: Practical (P) ; Theoretical (T) 

Abstract: Addresses the problem of similarity searching in large
time-series databases. We introduce a novel indexing algorithm that
allows faster retrieval. The index is formed by creating bins that contain
time series subsequences of approximately the same shape. For each bin, we
can quickly calculate a lower bound on the distance between a given query
and the most similar element of the bin. This bound allows us to search the
bins in best-first order, and to prune some bins from the search space
without having to examine the contents. Additional speedup is obtained by
optimizing the data within the bins such that we can avoid having to
compare the query to every item in the bin. We call our approach STB (Shape
To Bit-vector) indexing, and experimentally validate it on space
telemetry, medical and synthetic data, demonstrating approximately an
order-of-magnitude speedup. (25 Refs)
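
Note: the best-first search with lower-bound pruning described above can be sketched generically as follows. This is not the STB algorithm: the bins, the lower-bound function and the distance function are all supplied by the caller, and the toy helpers at the end use plain numbers instead of time series subsequences purely for illustration.

```python
import heapq

def best_first_search(query, bins, lower_bound, distance):
    """Nearest-item search over bins, visiting them in order of lower bound.

    lower_bound(query, bin) must never exceed the true distance from the
    query to any item in that bin. Returns (item, distance, bins_examined).
    """
    heap = [(lower_bound(query, b), i) for i, b in enumerate(bins)]
    heapq.heapify(heap)
    best_d, best_item, examined = float("inf"), None, 0
    while heap:
        lb, i = heapq.heappop(heap)
        if lb >= best_d:
            break  # every remaining bin is at least this far away
        examined += 1
        for item in bins[i]:
            d = distance(query, item)
            if d < best_d:
                best_d, best_item = d, item
    return best_item, best_d, examined

# Toy helpers (assumptions): items are numbers, a bin is bounded by its range.
def interval_lower_bound(q, b):
    return max(min(b) - q, 0.0, q - max(b))

def abs_distance(q, x):
    return abs(q - x)
```

Called as `best_first_search(10.4, [[1, 2, 3], [10, 11, 12], [20, 21]], interval_lower_bound, abs_distance)`, the search opens only the middle bin, because the other two bins' lower bounds already exceed the best distance found there.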
Descriptors: database indexing; database theory; medical information
systems; query processing; software performance evaluation; space telemetry
indexing algorithm; fast retrieval; bins; time series subsequences; lower
telemetry data; medical data; synthetic data
Conference Title: Proceedings 1998 International Conference on Image
Publisher: IEEE Comput. Soc, Los Alamitos, CA, USA

Publication Date: 1998 Country of Publication: USA 3 vol. 
U.S. Copyright Clearance Center Code: 0 8186 8821 1/98/$10.00
Conference Title: Proceedings of IPCIP'98 International Conference on
Image Processing

Conference Sponsor: IEEE Signal Process. Soc 

Conference Date: 4-7 Oct. 1998 Conference Location: Chicago, IL, USA 
Language: English Document Type: Conference Paper (PA) 
Treatment: Theoretical (T) ; Experimental (X) 

Abstract: We address the issue of image database retrieval based on
color using various vector distance metrics. Our system is based on color
segmentation where only a few representative color vectors are extracted
from each image and used as image indices. These vectors are then
used with vector distance measures to determine similarity between a
query color and a database image. We test numerous popular vector
distance measures in our system and find that directional measures provide
the most accurate and perceptually relevant retrievals. (7 Refs)
Conference Title: Proceedings of 5th International Conference on
Information and Knowledge Management
Conference Sponsor: ACM; NASA; Bell Commun.; NSF; AAAI; IEEE Comput. Soc.
USA

Language: English Document Type: Conference Paper (PA) 
Treatment: Practical (P) 

Abstract: Presents an indexing method that can be used to search a large
collection of cursive handwriting. The basic idea is to segment each
cursive string into a set of strokes. Each of these strokes can be
described with a set of features and can thus be stored as points in the
feature space. The Karhunen-Loeve transform is then used to minimize the
number of features used (the data dimensionality), and thus the index
size. Feature vectors are stored in an R-tree. Similarity searching
can be performed by executing a few range queries and then applying a
simple voting algorithm to the output in order to select the strings that
are most similar to the query. The proposed index can support similarity
queries as well as substring matching. It is resilient to the kind of
errors that result from the segmentation process, namely stroke
insertion/deletion and m-n substitution. The proposed index achieves a
substantial saving in search time over a sequential search. Moreover, it
improves the matching rate by up to 46% over the sequential search. (11
Refs)
Subfile: C 

Descriptors: handwriting recognition; image segmentation; indexing; 
query processing; software performance evaluation; string matching; 
transforms; tree searching; visual databases 

Identifiers: fast retrieval; cursive handwriting; indexing method; 
cursive string segmentation; stroke set; feature space; Karhunen-Loeve 
transform; feature number minimization; data dimensionality minimization; 
index size minimization; feature vectors; R-tree; similarity searching;
range queries; voting algorithm; similarity queries; substring matching;
error resilience; stroke insertion; stroke deletion; m-n substitution;
search time; matching rate 

Class Codes: C6160S (Spatial and pictorial databases); C6120 (File
organisation); C5260B (Computer vision and image processing techniques)
Copyright 1997, IEE 



10/5/22 (Item 10 from file: 2) 

DIALOG (R) File 2 : INSPEC 

(c) 2003 Institution of Electrical Engineers. All rts. reserv. 

4725434 INSPEC Abstract Number: C9409-6160S-018 
Title: A query based mechanism for multimedia information retrieval 

Author(s): Davcev, D.; Cakmakov, D.; Arnautovski, V. 

Author Affiliation: Fac. of Electr. Eng. & Comput. Sci., Skopje Univ.,
Macedonia 
p. 21-38 

Publisher: Arizona State Univ, Tempe, AZ, USA 
Conference Date: 7 Feb. 1992 Conference Location: Tempe, AZ, USA
Language: English Document Type: Conference Paper (PA) 
Treatment: Practical (P) 

Abstract: The basic elements of a multimedia cognitive-based information
retrieval system called AMCIRS, which integrates image and text
information, have been described by D. Davcev et al. (1991). We extended the
AMCIRS query mechanism, which is based on multimedia object content
search using the vector model. The content search process is reduced to
similarity estimation between query and index vectors. An
important part of this process is the similarity estimation between
geometrical image forms found in the query and index vectors. The
main objective of the paper is to introduce the similarity estimation model
(SEM) for geometrical objects as a part of the query mechanism of AMCIRS. The
model for polygon similarity estimation introduces a numerical measure of
similarity between two polygons and gives acceptable results for all
polygon forms and any number of vertices. The algorithm based on this model
as well as the simulation results are also given. (20 Refs)
002.5:005

LANGUAGE: Japanese COUNTRY OF PUBLICATION: Japan 

DOCUMENT TYPE: Journal 
ARTICLE TYPE: Original paper 
MEDIA TYPE: Printed Publication 

ABSTRACT: Recently, the demand for fast similarity search for images
is increasing. Indexing methods for n-dimensional vectors are of
great interest, because similarity search methods for images
usually use n-dimensional vectors holding the features of the image
to calculate similarity. A feature vector of an image often has
over a hundred dimensions, but previous similarity search methods
do not work effectively for high-dimensional vectors. In this
paper we propose a similarity search method that works
effectively for high-dimensional vectors. (author abst.)

DESCRIPTORS: image retrieval; database; tree structure; similarity;
vector space; response time; accuracy; tree (graph)

IDENTIFIERS: similar image retrieval; B tree 

BROADER DESCRIPTORS: retrieval; structure; property; mathematical space;
space; time; degree; subgraph; graph
CLASSIFICATION CODE(S): JD03030U; JE04010I; AC06020S



10/5/24 (Item 2 from file: 94) 

DIALOG(R) File 94: JICST-EPlus

(c)2003 Japan Science and Tech Corp(JST). All rts. reserv. 

04845768 JICST ACCESSION NUMBER: 01A0297612 FILE SEGMENT: JICST-E 
Visual Database. Texture Image Retrieval Based on the Hierarchical
Correlations of Wavelet Coefficients. 

KOBAYAKAWA MICHIHIRO (1); HOSHI MAMORU (1); OMORI TADASHI (1) 
(1) Dentsudai Daigakuinjohoshisutemugakukenkyuka

Joho Shori Gakkai Ronbunshi (Transactions of Information Processing Society
of Japan), 2001, VOL. 42, NO. SIG1 (TOD8), PAGE. 12-20, FIG. 2, TBL. 3, REF. 11
JOURNAL NUMBER: Z0778AAZ ISSN NO: 0387-5806 

UNIVERSAL DECIMAL CLASSIFICATION: 681.3:061.68 681.3:621.397.3 
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LANGUAGE: Japanese COUNTRY OF PUBLICATION: Japan 

DOCUMENT TYPE: Journal 
ARTICLE TYPE: Original paper 
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ABSTRACT: In this paper we propose a robust texture image retrieval
method using hierarchical relations between the decomposed subimages
obtained by the wavelet transform. The key idea is to describe texture
information in terms of the hierarchical correlations between the
wavelet coefficients of adjacent levels. First, we express the
pyramidal structure of the wavelet coefficients by associating the
nodes of a complete quad tree with the wavelet coefficients. Second, we
define a hierarchical dissimilarity vector between a parent node and
its child to express the hierarchical relation between them. Third, to
describe the relation among child nodes, we compute a covariance matrix
of the dissimilarity vectors and associate it with the parent node. We
define the texture vector as the diagonal elements of the covariance
matrix, and then define the texture feature vector of level 1 as the
pair of the mean and the standard deviation of the texture vectors of
level 1. Finally, by applying discriminant analysis to the set of
texture feature vectors, we build an effective index of the database.
To retrieve similar images, we use k-nearest neighbor search in the
index space. The similarity between two images is defined by the
Euclidean distance between the corresponding feature vectors of the
images. To evaluate retrieval performance, we ran experiments on
"Cloth Collections", consisting of 51 textile patterns at 10 different
resolutions (image size is 1024*1024 pixels). The experiments showed
that retrieval performance is good and that the proposed method is
robust with respect to resolution. (author abst.)
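
The retrieval step described in the abstract above (k-nearest neighbor search with Euclidean distance between feature vectors) can be sketched as follows. This is a minimal illustration only: the function names, the image ids, and the 3-dimensional toy vectors are assumptions made for the example, and the sketch does a linear scan rather than using the discriminant-analysis index the authors describe.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(query, index, k):
    """Return the k (distance, image_id) pairs closest to the query.

    `index` maps a hypothetical image id to its feature vector.
    """
    scored = ((euclidean(query, vec), image_id) for image_id, vec in index.items())
    return heapq.nsmallest(k, scored)


# Toy index with made-up vectors; not data from the paper.
index = {
    "cloth_a": [0.0, 0.0, 0.0],
    "cloth_b": [1.0, 0.0, 0.0],
    "cloth_c": [0.0, 3.0, 4.0],
}
result = k_nearest([0.9, 0.0, 0.0], index, k=2)
```

Here a smaller distance means a more similar image, so the first entry of `result` is the best match for the query vector.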
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ABSTRACT: This paper describes the similarity search index VA-TREE,
which uses the vector approximation method to represent internal and
leaf nodes in a tree-structured index. Our experiment on 16-dimensional
hue, 16-dimensional intensity and 24-dimensional shape data used in an
actual image retrieval application shows its advantage over the flat
index VA-File, which also uses vector approximation. The experiment
also shows that the number of feature vectors accessed on disk by
VA-TREE is less affected by the data distribution than that of
VA-File or the VAM Split R-tree. (author abst.)
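
The general idea of vector approximation mentioned in the abstract above can be illustrated with a small sketch: each vector is quantized to a few bits per dimension, a lower bound on the distance is computed from the approximation alone, and the full vector is fetched only when that bound cannot rule it out. This is a simplified flat filter under stated assumptions (coordinates in [0, 1), 2 bits per dimension, hypothetical function names); it is not the VA-TREE structure and not the authors' implementation.

```python
import math

BITS = 2            # bits per dimension (assumed for the example)
CELLS = 1 << BITS   # number of cells per dimension


def approximate(vec):
    """Quantize each coordinate in [0, 1) to a cell number."""
    return tuple(min(int(x * CELLS), CELLS - 1) for x in vec)


def lower_bound(query, approx):
    """Smallest possible Euclidean distance from query to any point in the cell."""
    total = 0.0
    for q, cell in zip(query, approx):
        lo, hi = cell / CELLS, (cell + 1) / CELLS
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
    return math.sqrt(total)


def nearest(query, vectors):
    """Exact nearest neighbor; returns (position, distance, vectors fetched)."""
    approxs = [approximate(v) for v in vectors]
    order = sorted(range(len(vectors)), key=lambda i: lower_bound(query, approxs[i]))
    best, best_dist, fetched = None, math.inf, 0
    for i in order:
        if lower_bound(query, approxs[i]) >= best_dist:
            break  # no remaining candidate can be strictly closer
        fetched += 1  # stands in for one full-vector read from disk
        d = math.dist(query, vectors[i])
        if d < best_dist:
            best, best_dist = i, d
    return best, best_dist, fetched


# Made-up 2-dimensional vectors; not data from the paper.
vectors = [[0.1, 0.1], [0.9, 0.9], [0.15, 0.2], [0.6, 0.1]]
best, best_dist, fetched = nearest([0.12, 0.12], vectors)
```

The `fetched` counter corresponds to the quantity the abstract measures: how many full feature vectors must be read after filtering on the approximations.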
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ABSTRACT: For similarity retrieval based on feature vectors, how to
construct an index of feature vectors that improves retrieval
efficiency has become an important topic. In this article we introduce
a newly developed indexing method for feature vectors, which is
composed of "recurrence clustering" and a "removal search strategy."
Recurrence clustering is a classification method used to construct a
tree-like index of feature vectors based on the similarities between
them, and the removal search strategy is a method tailored to
searching the index structure constructed by recurrence clustering.
Similarity retrieval based on feature vectors can be made more
efficient by this new approach: its retrieval cost is less than that
of the linear associative retrieval strategy when the recall ratio is
less than 10%. The effectiveness of the approach was confirmed in
related experiments using feature vectors extracted from full text.
(author abst.)
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